Incorporating Tweet Relationships into Topic Derivation
Authors
Abstract
With its rapid user growth, Twitter has become an essential source of information about events happening in the world. It is therefore critical to be able to derive topics from Twitter messages (tweets), that is, to determine and characterize their main topics. However, tweets are very short, so the frequency of term co-occurrences is very low. This sparsity in the relationship between tweets and terms leads to a poor characterization of topics when only the content of the tweets is used. In this paper, we exploit the relationships between tweets and propose intLDA, a variant of Latent Dirichlet Allocation (LDA) that goes beyond content and directly incorporates the relationships between tweets. We conducted experiments on a Twitter dataset and evaluated performance in terms of both topic coherence and tweet-topic accuracy. Our experiments show that intLDA outperforms methods that do not use relationship information.
Keywords: Topic Derivation; Twitter; Tweets Relationship
Similar Resources
Derivation of Henderson's Method of Incorporating Artificial Insemination Sire Evaluations into Intraherd Prediction of Breeding Values
A derivation is given for the method proposed by Henderson to incorporate evaluations of sires based on daughter records in other herds into the intraherd prediction of breeding values. The derivation consists of pretending the daughters in other herds have records which have been adjusted for the herd-year-season effects of their origin. These equations then are absorbed into the equation for ...
Entity Tracking in Real-Time Using Sub-topic Detection on Twitter
The velocity, volume and variety with which Twitter generates text are increasing exponentially. It is critical to determine latent sub-topics from such tweet data at any given point in time in order to provide better topic-wise search results relevant to users’ informational needs. The two main challenges in mining subtopics from tweets in real-time are (1) understanding the semantic and the conceptu...
Topic Evolutionary Tweet Stream Clustering Algorithm and TCV Rank Summarization
Tweets are created as short text messages and shared by both users and data analysts. Twitter, which receives over 400 million tweets per day, has emerged as an invaluable source of news, blogs, opinions and more. Our proposed work consists of three components: first, tweet stream clustering, which clusters tweets using the k-means clustering algorithm, and second, a tweet cluster vector technique to generate rank summariz...
Improving Twitter Sentiment Classification Using Topic-Enriched Multi-Prototype Word Embeddings
It has been shown that learning distributed word representations is highly useful for Twitter sentiment classification. Most existing models rely on a single distributed representation for each word. This is problematic for sentiment classification because words are often polysemous and each word can contain different sentiment polarities under different topics. We address this issue by learnin...
Talk of the City: Our Tweets, Our Community Happiness
The literature of urban sociology and that of psychology have separately established two relationships: the first has linked characteristics of a community to its residents’ well-being, the second has linked well-being of individuals to their use of words. No one has hitherto explored the potential transitive relationship between characteristics of a community and its residents’ use of wor...